An API for Discourse-level Access to XML-encoded Corpora

Authors

  • Christoph Müller
  • Michael Strube
Abstract

We describe a simple and efficient Java object model and application programming interface (API) for (possibly multi-modal) annotated natural language corpora. Corpora are represented as elements such as Sentences, Turns, Utterances, Words, Gestures and Markables. The API lets linguists access corpora in terms of these discourse-level elements, i.e. at a conceptual level they are familiar with, while retaining the flexibility of a general-purpose programming language. It also contributes to corpus standardization efforts, because it is based on a straightforward and easily extensible data model that can serve as a target for converting different corpus formats.
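To make the idea of discourse-level access concrete, the following is a minimal sketch of what such an object model might look like. It is not the API described in the paper: the class names `Word`, `Markable` and `Sentence` are borrowed from the element types listed in the abstract, but every constructor, method and field below (including `addMarkable` and `textOf`) is a hypothetical assumption made for illustration, and XML loading is left out entirely.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch only: all names and signatures here are illustrative
// assumptions, not the API from the paper.
public class DiscourseModelSketch {

    /** A single token of the corpus. */
    static final class Word {
        private final String text;

        Word(String text) { this.text = text; }

        String getText() { return text; }
    }

    /** A typed span over the words of one sentence (both indices inclusive). */
    static final class Markable {
        private final String type;
        private final int start;
        private final int end;

        Markable(String type, int start, int end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }

        String getType() { return type; }
        int getStart() { return start; }
        int getEnd() { return end; }
    }

    /** A sentence owns its words and the markables defined over them. */
    static final class Sentence {
        private final List<Word> words = new ArrayList<>();
        private final List<Markable> markables = new ArrayList<>();

        Sentence(String... tokens) {
            for (String token : tokens) {
                words.add(new Word(token));
            }
        }

        /** Registers a markable after checking that the span lies inside the sentence. */
        void addMarkable(String type, int start, int end) {
            if (start < 0 || end >= words.size() || start > end) {
                throw new IllegalArgumentException("span out of range");
            }
            markables.add(new Markable(type, start, end));
        }

        List<Word> getWords() { return Collections.unmodifiableList(words); }

        List<Markable> getMarkables() { return Collections.unmodifiableList(markables); }

        /** Returns the surface text covered by a markable, joined by single spaces. */
        String textOf(Markable m) {
            StringBuilder sb = new StringBuilder();
            for (int i = m.getStart(); i <= m.getEnd(); i++) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(words.get(i).getText());
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) {
        Sentence s = new Sentence("The", "old", "man", "smiled");
        s.addMarkable("np", 0, 2);
        for (Markable m : s.getMarkables()) {
            System.out.println(m.getType() + ": " + s.textOf(m)); // prints "np: The old man"
        }
    }
}
```

In this sketch a client iterates over sentences and markables rather than over XML nodes, which is the kind of conceptual-level access the abstract refers to. A real implementation would also need to cover Turns, Utterances and Gestures and to populate the objects from the XML encoding.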

Similar resources

JAX-RX - Unified REST Access to XML Resources

REST nowadays represents, besides SOAP, one common way to access distributed resources in a web-affine manner. While SOAP can be easily utilized by high-level programming languages like Java (e.g. JAX-WS as one common standardized way), REST catches up regarding straight usages (e.g. JAX-RS regarding Java). With the clean and direct usage of JAX-RS, common layers for standardised access on he...

Step by step: underspecified markup in incremental rhetorical analysis

While quite a few linguistic corpora with syntactic annotations are available today, resources are scarce on the level of discourse annotation. A flexible, extendible annotation format speeds up development. We therefore propose an XML format for annotating rhetorical structure trees. In human and automatic analysis, rhetorical structure is often difficult and assigned incrementally. Thus, the ...

RBAC Policies in XML for X.509 Based Privilege Management

This paper describes a role based access control policy template for use by privilege management infrastructures where the roles are stored as X.509 Attribute Certificates in an LDAP directory. There is a brief description of the X.509 privilege management model, and how it can be used to implement RBAC. Policies that conform to the template are written in XML, and the template is specified as ...

Access and Mobility Policy Control at the Network Edge

The fifth generation (5G) system architecture is defined as service-based and the core network functions are described as sets of services accessible through application programming interfaces (API). One of the components of 5G is Multi-access Edge Computing (MEC) which provides the open access to radio network functions through API. Using the mobile edge API third party analytics applications ...

Solving Crossword Puzzles via the Google API

The Google™ API enables software agents to query and use search results from the large collections of data available via the ever-popular Google search engine. Web searches using Google are exposed to over 4 billion pages, many of which are cached within Google. While the Google API may be used to produce customized user interfaces to Google, the API also provides direct programmatic access to...

Publication date: 2002